CLAIMS

1. A method of determining parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from real user interactions with a World Wide Web site, said method comprising:
    maintaining at least one log file containing at least one set of parameters resulting from real user interactions with said World Wide Web site;
    analyzing said log file to determine parameter combinations for automated access to said World Wide Web content.

2. A method of determining parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from real user interactions with a World Wide Web site, as per claim 1, wherein said parameters are entries in HTML forms, said analyzing step further comprising:
    ranking entries in each set of entries according to their frequency of occurrence;
    for each set of entries resulting from unlimited text entries, excluding entries ranked below a predetermined number; and
    wherein said parameter combinations are determined by producing combinations of entries from each set of entries.
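
The ranking, exclusion and combination steps recited in claim 2 can be illustrated with a minimal sketch. It is not the claimed implementation: the function name `parameter_combinations`, the dictionary layout of the logged entries, and the reading of "ranked below a predetermined number" as "keep only the `top_n` most frequent entries" are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def parameter_combinations(logged_sets, unlimited_fields, top_n):
    """Rank logged entries per form field and combine them.

    logged_sets      -- hypothetical layout: {field name: [logged values]}
    unlimited_fields -- names of fields that accept unlimited free text
    top_n            -- assumed meaning of the "predetermined number"
    """
    candidates = {}
    for field, values in logged_sets.items():
        # Rank distinct entries by how often real users submitted them.
        ranked = [value for value, _ in Counter(values).most_common()]
        # Only unlimited text fields are cut off below the chosen rank.
        if field in unlimited_fields:
            ranked = ranked[:top_n]
        candidates[field] = ranked

    fields = list(candidates)
    # One entry from each set of entries per combination.
    return [
        dict(zip(fields, combo))
        for combo in product(*(candidates[f] for f in fields))
    ]


logged = {
    "category": ["books", "music", "books"],
    "q": ["java", "java", "perl", "rust", "perl", "java"],
}
combos = parameter_combinations(logged, unlimited_fields={"q"}, top_n=2)
```

In this example the rarely submitted query `rust` is excluded, while every entry of the predefined `category` set is retained.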

3. A method of determining parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from real user interactions with a World Wide Web site, as per claim 2, wherein said parameter combinations are determined by producing all combinations of entries from each set of entries.

4. A method of determining parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from real user interactions with a World Wide Web site, as per claim 2, wherein entries resulting from limited text entries and unlimited text entries have stop words removed and remaining words stemmed.
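
Stop-word removal and stemming of a text entry, as recited in claim 4, might look like the following sketch. The stop-word list and the naive suffix-stripping rule are illustrative assumptions only; the claims do not name a particular stop list or stemming algorithm.

```python
# Illustrative stop list and suffixes; both are assumptions, not from the claims.
STOP_WORDS = frozenset({"a", "an", "and", "for", "in", "of", "the", "to"})
SUFFIXES = ("ing", "es", "s")


def stem(word):
    """Strip one common suffix, keeping a stem of at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize_entry(entry):
    """Lower-case a logged text entry, drop stop words, stem the rest."""
    words = entry.lower().split()
    return " ".join(stem(w) for w in words if w not in STOP_WORDS)


normalized = normalize_entry("The history of printing presses")
```

Normalizing entries in this way lets differently worded user queries count toward the same ranked entry.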

5. A method of determining parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from real user interactions with a World Wide Web site, as per claim 1, wherein said log file is maintained by a proxy server that logs communications between a client and a Web server resulting from real user accesses to said World Wide Web content.
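
As an illustration of how sets of entries could be recovered from such a proxy log, the sketch below assumes a hypothetical log format in which each line holds an HTTP request line for a form submitted with GET. The claims do not specify a log format, and the function name `collect_entries` is likewise an assumption.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def collect_entries(log_lines, form_path):
    """Group logged query-string values by form field name.

    Assumes each log line looks like 'GET /path?name=value HTTP/1.1'.
    """
    entries = defaultdict(list)
    for line in log_lines:
        parts = line.split()
        # Skip malformed lines and anything that is not a GET request.
        if len(parts) < 2 or parts[0] != "GET":
            continue
        url = urlsplit(parts[1])
        # Keep only submissions to the form of interest.
        if url.path != form_path:
            continue
        for name, value in parse_qsl(url.query):
            entries[name].append(value)
    return dict(entries)


log = [
    "GET /search?q=java&category=books HTTP/1.1",
    "GET /index.html HTTP/1.1",
    "GET /search?q=perl+hashes&category=books HTTP/1.1",
    "POST /search HTTP/1.1",
]
entry_sets = collect_entries(log, "/search")
```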

6. A method of determining parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from real user interactions with a World Wide Web site, as per claim 1, wherein said content is automatically accessed using said parameter combinations.
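
One way a parameter combination could be turned into a request for automated access is sketched below. It assumes the form is submitted with GET and only builds the request URLs without fetching them; the function name `query_urls` and the example address are hypothetical.

```python
from urllib.parse import urlencode


def query_urls(form_action, combinations):
    """Build one GET URL per parameter combination (no network access)."""
    return [f"{form_action}?{urlencode(combo)}" for combo in combinations]


urls = query_urls(
    "http://example.com/search",
    [{"category": "books", "q": "perl hashes"}],
)
```

A crawler could then retrieve each URL in turn, which keeps the generation of combinations separate from the retrieval step.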

7. A method of increasing web crawler penetration of Web databases accessible via HTML forms, said method comprising:
    reviewing previous real user queries;
    identifying possible queries for said Web crawler from said previous real user queries by synthesis of entries for any of: predefined sets, limited text entries or unlimited text entries; and
    providing said identified queries to said Web crawler during an instantiation of automated access to said Web databases by said Web crawler.



8. A method of increasing web crawler penetration of Web databases accessible via HTML forms, as per claim 7, wherein said previous user queries are maintained in a log file.

9. A method of increasing web crawler penetration of Web databases accessible via HTML forms, as per claim 8, wherein said log file is maintained by a proxy server.

10. A method of increasing web crawler penetration of Web databases accessible via HTML forms, as per claim 7, wherein said synthesis comprises:
    ranking any entries for predetermined sets;
    ranking any entries for limited text entries;
    ranking any entries for unlimited text entries;
    excluding entries for unlimited text entries ranked below a predetermined number; and
    pairing entries from each set of ranked entries.

11. A method of increasing web crawler penetration of Web databases accessible via HTML forms, as per claim 10, wherein said synthesis further comprises:
    removing stop words and stemming remaining words for entries resulting from limited text entries and unlimited text entries.



12. A method of determining entries for input items of an HTML form for automated accesses to content contained in a Web database behind said HTML form, said method comprising:
    maintaining a log of real user entries for said input items;
    analyzing said log to determine entry combinations for said input items.

13. A method of determining entries for input items of an HTML form for automated accesses to content contained in a Web database behind said HTML form, as per claim 12, wherein said log file contains at least one set of entries, said analyzing step further comprising:
    ranking entries in each set of entries according to their frequency of occurrence;
    for each set of entries resulting from unlimited text entries, excluding entries ranked below a predetermined number; and
    wherein said automated parameter combinations are determined by producing combinations of entries from each set of entries.

14. A method of determining entries for input items of an HTML form for automated accesses to content contained in a Web database behind said HTML form, as per claim 13, wherein said parameter combinations are determined by producing all combinations of entries from each set of entries.

15. A method of determining entries for input items of an HTML form for automated accesses to content contained in a Web database behind said HTML form, as per claim 13, wherein entries resulting from limited text entries and unlimited text entries have stop words removed and remaining words stemmed.

16. A method of determining entries for input items of an HTML form for automated accesses to content contained in a Web database behind said HTML form, as per claim 12, wherein said log file is maintained by a proxy server that logs communications between a client and a Web server resulting from real user accesses to said World Wide Web content.




17. A method of emulating real user access to World Wide Web content dynamically accessible via an HTML form, said method comprising:
    maintaining a log containing real user entries into each input item of said HTML form;
    ranking entries for each input item according to their frequency of occurrence;
    for each unlimited text entry input item, excluding entries ranked below a predetermined number;
    determining combinations of entries from each set of entries; and
    automatically accessing said content using said combinations of entries.

18. A method of emulating real user access to World Wide Web content dynamically accessible via an HTML form, as per claim 17, wherein entries resulting from limited text entries and unlimited text entries have stop words removed and remaining words stemmed.

19. A method of emulating real user access to World Wide Web content dynamically accessible via an HTML form, as per claim 17, wherein said log file is maintained by a proxy server that logs communications between a client and a Web server resulting from real user accesses to said World Wide Web content.

20. An article of manufacture comprising a computer usable medium having computer readable program code embodied therein to determine parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from user interactions with a World Wide Web site, said computer readable program code comprising:
    computer readable program code for maintaining at least one log file representative of real user interactions with said World Wide Web site;
    computer readable program code for analyzing said log file to determine parameter combinations for automated access to said World Wide Web content.

21. An article of manufacture comprising a computer usable medium having computer readable program code embodied therein to determine parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from user interactions with a World Wide Web site, as per claim 20, wherein said parameters are entries in HTML forms, said computer readable program code for analyzing further comprising:
    computer readable program code for ranking entries in each set of entries according to their frequency of occurrence; and
    computer readable program code for each set of entries resulting from unlimited text entries, excluding entries ranked below a predetermined number; and
    wherein said parameter combinations are determined by producing combinations of entries from each set of entries.

22. An article of manufacture comprising a computer usable medium having computer readable program code embodied therein to determine parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from user interactions with a World Wide Web site, as per claim 21, wherein said parameter combinations are determined by producing all combinations of entries from each set of entries.



23. An article of manufacture comprising a computer usable medium having computer readable program code embodied therein to determine parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from user interactions with a World Wide Web site, as per claim 21, wherein entries resulting from limited text entries and unlimited text entries have stop words removed and remaining words stemmed.

24. An article of manufacture comprising a computer usable medium having computer readable program code embodied therein to determine parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from user interactions with a World Wide Web site, as per claim 20, wherein said log file is maintained by a proxy server that logs communications between a client and a Web server resulting from real user accesses to said World Wide Web content.

25. An article of manufacture comprising a computer usable medium having computer readable program code embodied therein to determine parameter combinations for automated access to World Wide Web content that is accessible based on parameters resulting from user interactions with a World Wide Web site, as per claim 20, wherein said content is automatically accessed using said parameter combinations.


